Identifying Suspicious Bidders Utilizing Hierarchical Clustering and Decision Trees
Abstract
Identifying bidders whose suspicious bidding activity may indicate online auction fraud is difficult because of the large number of users participating in online auctions. To reduce the number of users to be investigated, we examine observable features of a bidder’s behavior and use a hierarchical clustering technique to divide a collection of bidders into normal and deviant groups. Based on the clustering results, we generate a decision tree that can be used to efficiently characterize new bidders as normal, suspicious, or highly suspicious. To illustrate the effectiveness of our proposed approach, we collected real auction datasets from online auctions and used a 3-fold validation approach to show that the error rates of the generated decision trees are reasonably low.
Similar resources
A Real-Time Self-Adaptive Classifier for Identifying Suspicious Bidders in Online Auctions
With the significant increase of available item listings in popular online auction houses nowadays, it becomes nearly impossible to manually investigate the large amount of auctions and bidders for shill bidding activities, which are a major type of auction fraud in online auctions. Automated mechanisms such as data mining techniques were proven to be necessary to process this type of increasin...
Classification and Cluster Analysis of Complex Time-of-Flight Secondary Ion Mass Spectrometry for Biological Samples
Identifying and separating subtly different biological samples is one of the most critical tasks in biological analysis. Time-of-flight secondary ion mass spectrometry (ToF-SIMS) is becoming a popular and important technique in the analysis of biological samples, because it can detect molecular information and characterize chemical composition. ToF-SIMS spectra of biological samples are enormou...
Clustering Trees with Instance Level Constraints
Constrained clustering investigates how to incorporate domain knowledge in the clustering process. The domain knowledge takes the form of constraints that must hold on the set of clusters. We consider instance level constraints, such as must-link and cannot-link. This type of constraints has been successfully used in popular clustering algorithms, such as k-means and hierarchical agglomerative ...
Inferring Hierarchical Clustering Structures by Deterministic Annealing
The unsupervised detection of hierarchical structures is a major topic in unsupervised learning and one of the key questions in data analysis and representation. We propose a novel algorithm for the problem of learning decision trees for data clustering and related problems. In contrast to many other methods based on successive tree growing and pruning, we propose an objective function for tree...
Publication date: 2010